Presentation: Tweet"Enabling a Large Internal Analytic User Base with Hadoop & Co"
Criteo is a performance driven company that requires a minimum of analytic prowess from all of its employees. We process up to 50 billion real time bidding requests daily resulting in an ingest rate on the order of 25TB/day to a 1000+ node Hadoop cluster. The many petabyte-scale data warehouse is accessed directly by hundreds of engineers, analysts and data scientists and indirectly by over a thousand operational users via domain specific datamarts and reporting systems.
In this talk I will detail our Hadoop-based analytics stack and how we take advantage of technologies like Scalding, Hive, Vertica and Tableau (plus a few in-house items) to facilitate our different user groups' daily interactions with it.
Download slides